
A Parallel Tree code for large N-body simulation: dynamic load balance and data distribution on CRAY T3D system



Abstract

N-body algorithms for long-range unscreened interactions such as gravity belong to a class of highly irregular problems whose optimal solution is a challenging task for present-day massively parallel computers. In this paper we describe a strategy for optimal memory and work distribution, which we have applied to our parallel implementation of the Barnes & Hut (1986) recursive tree scheme on a Cray T3D using the CRAFT programming environment. We have performed a series of tests to find an "optimal data distribution" in the T3D memory and to identify a strategy for "Dynamic Load Balance" that gives good performance when running large simulations (more than 10 million particles). The tests show that the step duration depends on two main factors: data locality and T3D network contention. By increasing data locality we minimize the step duration: the closest bodies (direct interaction) should tend to be located in the same PE local memory (contiguous block subdivision, high granularity), whereas the tree properties should have a fine-grain distribution. In very large simulations, network contention gives rise to an unbalanced load. To remedy this we have devised an automatic work redistribution mechanism which provides a good Dynamic Load Balance at the price of an insignificant overhead.
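The two ideas in the abstract, block versus fine-grain data placement and redistribution of work between PEs, can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the authors' CRAFT code: the function names (`block_owner`, `cyclic_owner`, `neighbour_locality`, `step_time`), the `local_fraction` parameter, and the shared-pool scheme are all hypothetical, and the abstract does not say how the actual redistribution mechanism works. The model also assumes bodies are indexed so that spatial neighbours have adjacent indices, and it ignores network contention entirely.

```python
import heapq


def block_owner(i, n_items, n_pes):
    """PE that stores item i under a contiguous block distribution."""
    block = -(-n_items // n_pes)  # ceiling division: items per PE
    return i // block


def cyclic_owner(i, n_pes):
    """PE that stores item i under a cyclic (fine-grain) distribution."""
    return i % n_pes


def neighbour_locality(owner, n_items):
    """Fraction of index-adjacent pairs (i, i+1) stored on the same PE."""
    same = sum(owner(i) == owner(i + 1) for i in range(n_items - 1))
    return same / (n_items - 1)


def step_time(costs, n_pes, local_fraction):
    """Simulated duration of one step (hypothetical balancing scheme).

    Bodies are block-distributed. Each PE first processes the leading
    `local_fraction` of its own block; the leftover bodies go into a
    shared pool and are taken, one at a time, by whichever PE becomes
    free first. costs[i] is the assumed work for body i.
    """
    n_items = len(costs)
    block = -(-n_items // n_pes)
    finish_times = []
    pool = []
    for pe in range(n_pes):
        mine = costs[pe * block:(pe + 1) * block]
        keep = int(len(mine) * local_fraction)
        finish_times.append(sum(mine[:keep]))
        pool.extend(mine[keep:])
    heapq.heapify(finish_times)
    for cost in pool:
        earliest = heapq.heappop(finish_times)  # PE that is free first
        heapq.heappush(finish_times, earliest + cost)
    return max(finish_times)


if __name__ == "__main__":
    n_items, n_pes = 16, 4
    print(neighbour_locality(lambda i: block_owner(i, n_items, n_pes), n_items))
    print(neighbour_locality(lambda i: cyclic_owner(i, n_pes), n_items))
    costs = [4] * 4 + [1] * 12  # the first PE holds the expensive bodies
    print(step_time(costs, n_pes, 1.0), step_time(costs, n_pes, 0.5))
```

In this toy setting the block layout keeps most neighbouring bodies on the same PE while the cyclic layout keeps none, and handing part of each PE's work to a shared pool shortens the step when one PE holds the expensive bodies. The model says nothing about the cost of remote accesses, which the abstract identifies as a main factor on the real machine.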
